[v1] remove v0 code #344
Conversation
Signed-off-by: Yannick Schnider <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre. Now you are good to go 🚀
**vllm_spyre/platform.py** (outdated)

```diff
-        if is_decoder and not envs.VLLM_USE_V1:
-            raise ValueError("Decoder models are only supported on v1")
+        if not envs.VLLM_USE_V1:
+            raise ValueError("vllm-spyre is only supported with vLLM v1")
```
Suggested change:

```diff
-raise ValueError("vllm-spyre is only supported with vLLM v1")
+raise ValueError("vllm-spyre is only supported with vLLM v1. Please set VLLM_USE_V1=1")
```
**tests/e2e/test_spyre_embeddings.py** (outdated)

```diff
         monkeypatch.setenv("VLLM_SPYRE_DYNAMO_BACKEND", backend)
-        monkeypatch.setenv("VLLM_USE_V1", "1" if vllm_version == "V1" else "0")
+        monkeypatch.setenv("VLLM_USE_V1", "1")
```
Maybe we can replace all of these with one `os.environ["VLLM_USE_V1"] = 1` in conftest.py?
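For illustration, a minimal sketch of this conftest.py suggestion: export the flag once for the whole test session instead of monkeypatching it in every test. Note that `os.environ` values must be strings, so the value would have to be `"1"` rather than the integer `1` in the comment above.

```python
import os

# Session-wide alternative to per-test monkeypatch.setenv: set the flag
# once at import time in conftest.py. os.environ only accepts string
# values, so "1" is used instead of the int 1.
os.environ["VLLM_USE_V1"] = "1"
```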
But this env var is not needed anymore, right? Should we not add an assert in the llm engine to verify that it is an instance of the V1 Engine class?
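A hedged sketch of the assertion proposed here. The helper name is hypothetical and the check is illustrative, not vLLM's actual API: it only assumes that V1 engine classes live under the `vllm.v1` package, rather than importing a specific engine class.

```python
def assert_v1_engine(engine: object) -> None:
    # Check the class's module path instead of importing a concrete
    # engine class, so the guard survives upstream refactors. The
    # "vllm.v1." prefix is an assumption about vLLM's package layout.
    module = type(engine).__module__
    if not module.startswith("vllm.v1."):
        raise AssertionError(
            f"Expected a V1 engine instance, got {module}."
            f"{type(engine).__name__}")
```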
We already have this check in platform.py. Don't you think that is enough and we can safely remove all `monkeypatch.setenv("VLLM_USE_V1", "1")`?
I think at some point `envs.VLLM_USE_V1` will be removed. Also, this flag doesn't make sure that the current vLLM instance is not running as V0, as vLLM currently can fall back to V0 if `VLLM_USE_V1` is unset.
Can we check this flag with `os.getenv` instead? That way the plugin won't crash when `envs.VLLM_USE_V1` is removed upstream.
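For illustration, a sketch of the `os.getenv` guard suggested here (the function name is hypothetical; the error message mirrors the one proposed for platform.py):

```python
import os

def ensure_v1() -> None:
    # Read the raw environment variable rather than envs.VLLM_USE_V1,
    # so this keeps working even after the attribute is removed from
    # vLLM upstream. Unset or "0" means vLLM could fall back to the V0
    # engine, which vllm-spyre does not support.
    if os.getenv("VLLM_USE_V1") != "1":
        raise ValueError(
            "vllm-spyre is only supported with vLLM v1. "
            "Please set VLLM_USE_V1=1")
```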
We have `VLLM_USE_V1` in other places in the code. These will have to be removed anyway when `envs.VLLM_USE_V1` is removed upstream...
But aren't they removed on this PR?
I saw that the only remaining occurrences were for a hack testing both v0 and v1 engines. As we don't use that anymore, I removed it and incorporated your check via `os.getenv`. See this commit.
Nice!
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
I did a very radical change and removed all lines where we set `VLLM_USE_V1`.
Yeah, I think this is good enough then; this prevents anybody from running with `VLLM_USE_V1=0`.
@maxdebayser do we need to do a performance comparison of embeddings on v0 vs v1 first before deleting? Or are we good to go?
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
@joerunde , I think we're good to go, I can run the V0 tests on a frozen version.
bot:test
Signed-off-by: Yannick Schnider <[email protected]>
bot:test
```diff
 | Speculative Decoding | 🗓️ | |
 | Guided Decoding | 🗓️ | |
-| Pooling | ⚠️ | Works with V0. V1 still being developed in vLLM [vllm#18052](https://github.com/vllm-project/vllm/issues/18052) |
+| Pooling | ✅ | |
```
We already have Embedding models at the end of this table — is the Pooling row still needed?
question for @maxdebayser
Since we don't support all pooling applications, I think it's better to remove this row and leave just Embedding below.
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
negative diff let's goooo
Signed-off-by: Yannick Schnider <[email protected]>
### [docs] remove pooling models from supported features

Following up on a (late) discussion in #344 about removing the pooling models from the list of supported models, as not all pooling applications are supported and we already have embedding models in that list (see comment by @maxdebayser: [link](https://github.com/vllm-project/vllm-spyre/pull/344/files#r2247822334)).

---------

Signed-off-by: Yannick Schnider <[email protected]>
[v1] remove v0 code
Now that we have v1 support for embedding models (#277), we can finally delete the v0 code.
Note: for decoder models, v0 support was deprecated some time ago.